Human Genetics and Genomics Advances — Latest Matching Preprints

1

Prioritizing embryos with lower homozygosity may reduce disease risk in children of related individuals undergoing preimplantation genetic testing

Wolfram, T.; Ahangari, M.; Davidson, I.; Wartschinski, L.; Li, J. H.; Eyre, M.; Stern, D.; Schleede, J.; Haghighi, A.; Carmi, S.; Christensen, M.

2026-06-04 genetic and genomic medicine 10.64898/2026.05.30.26354526 medRxiv

Top 0.1%

4.0%

Consanguinity is a reproductive union between individuals who share a recent common ancestor. These unions are common in many regions of the world and increase the burden of rare recessive disorders by elevating autozygosity in offspring. Current reproductive genetic screening focuses on a limited set of known pathogenic variants, leaving most recessive risk unaddressed. Here we argue that embryo-level autozygosity, quantified as the fraction of the genome in long runs of homozygosity (FROH), is a potentially actionable genomic biomarker that can be integrated into routine preimplantation genetic testing as a homozygosity-informed embryo-prioritization framework (PGT-H) that can be layered onto existing embryo biopsy workflows when couples are already undergoing IVF with PGT-A or PGT-M. Using forward simulations of first-cousin and double-first-cousin couples, we show that siblings conceived by the same couple span a wide range of FROH; selecting the lowest-FROH candidate from a cohort of five embryos reduces FROH by approximately 40% on average. Combining these reductions with empirical effect-size estimates, we estimate that for first-cousin couples this strategy could reduce risk of intellectual disability by roughly 35-45% (corresponding to an absolute risk reduction of about 1.8-2.2%) and potentially reduce excess recessive disease burden, while also modestly reducing risk of common diseases such as type 2 diabetes. We outline how existing PGT-A and PGT-M workflows could potentially be extended to report embryo-level FROH and discuss ethical and counseling considerations. Autozygosity-based embryo prioritization offers a principled way to address a component of recessive risk that current variant-centric approaches miss.
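
The relative and absolute reductions quoted for intellectual disability can be tied together with the identity ARR = baseline risk × RRR. The sketch below back-calculates the baseline risk those figures imply. The abstract does not state a baseline, so the function name and the resulting value of roughly 5% are illustrative assumptions, not the authors' numbers.

```python
def implied_baseline_risk(absolute_reduction: float, relative_reduction: float) -> float:
    """Return the baseline risk implied by ARR = baseline * RRR.

    Both arguments are fractions (0.018 means 1.8 percentage points).
    Hypothetical helper for illustration; not part of the preprint.
    """
    if not 0 < relative_reduction <= 1:
        raise ValueError("relative_reduction must be in (0, 1]")
    return absolute_reduction / relative_reduction


# Endpoints quoted in the abstract: 35-45% relative, about 1.8-2.2% absolute.
low_end = implied_baseline_risk(0.018, 0.35)
high_end = implied_baseline_risk(0.022, 0.45)
print(f"implied baseline risk: {low_end:.1%} and {high_end:.1%}")
```

Both endpoints land close to the same baseline, which suggests the two ranges were derived from a single assumed risk for children of first-cousin couples.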

2

Contextualizing the Utility of Polygenic Risk Scores using Absolute Risk Models in Diverse Ancestry Populations

Chatterjee, N.; Martina, F.; Kachuri, L.; Natarajan, P.; Witte, J.; Huo, D.

2026-06-04 genetic and genomic medicine 10.64898/2026.06.03.26354842 medRxiv

Top 0.1%

3.5%

Polygenic risk scores (PRSs) are emerging as powerful tools for quantifying inherited risk for common diseases and, in some cases, are approaching clinical implementation. A major concern for PRS implementation is their limited accuracy in non-European populations, particularly in those of African ancestry. However, past evaluations have focused on metrics such as relative risk or AUC, which do not capture background risk arising from contextual factors. We introduce a novel measure of variable importance, the conditional average derivative estimator (CADE), to evaluate PRS utility across diverse contexts and populations within absolute risk models that integrate PRSs with other relevant risk factors. We illustrate this framework by integrating PRSs for breast and prostate cancer within age-specific absolute risk models for incidence and mortality fit using individual-level data from the All of Us Research Program with inputs from the National Cancer Institute SEER cancer registry. Our projections show that although the PRSs are known to have the lowest discriminatory accuracy in African Americans (AA), there are contexts in which they provide greater utility, such as for the stratification of prostate cancer risk and mortality, where the CADE values for AA were 2- and 7-fold higher than for European Americans. These findings suggest that conclusions about the limited clinical utility of PRS in non-European populations may be premature and underscore the need to quantify PRS risk-stratification utility at the absolute-risk level, while accounting for disease onset, survival, and broader health and economic factors.

3

Breast cancer polygenic risk score performance varies by socioeconomic status

Domian, H. I.; Tian, X.; Ong, D.; Hamilton, L.; Shieh, Y.; Musharoff, S. A.

2026-06-04 genetic and genomic medicine 10.64898/2026.06.03.26354819 medRxiv

Top 0.4%

1.2%

Background: Polygenic risk scores (PRS) for breast cancer are increasingly used for risk stratification to inform screening and prevention. However, for PRSs to be equitable and clinically useful, they need to perform well across diverse populations. While PRS performance is known to be ancestry-dependent, it is not well understood how environmental context, such as socioeconomic status (SES), affects PRS transferability. Here, we assess whether SES, measured via self-reported household income, modifies breast cancer PRS performance and, if so, whether socioeconomic context contributes predictive information beyond genetic risk alone. Methods: We used the US-based All of Us biobank to evaluate how SES impacts breast cancer PRS performance. First, we quantified changes in breast cancer PRS performance by modeling a commonly cited polygenic score for breast cancer, previously described by Mavaddat et al., together with SES. We then re-estimated the genetic effect sizes of the 3,820 variants from Mavaddat et al. in All of Us with and without income as a covariate. Because social determinants of health affect breast cancer detection and outcomes, we stratified analyses by socially defined populations on the basis of self-identified race and ethnicity. We further stratified individuals whose self-identified race is White ("White") into three SES groups (high, middle, low) based on self-reported income and re-estimated genetic effect sizes to create SES-specific PRSs. We then applied these PRSs to White participants, the largest group in the study, and to Black or African American ("Black") and Hispanic or Latino ("Hispanic") participants, groups underrepresented in breast cancer research. Model discrimination between cases and controls was measured by area under the curve (AUC).
Results: We analyzed 163,715 women from the All of Us biobank, which included 8,833 breast cancer cases (6,619 White, 1,178 Black, and 1,036 Hispanic), with relative income available for a subset of these cases (5,525 White, 848 Black, and 566 Hispanic). The ancestry-dependent performance of the breast cancer PRS described in Mavaddat et al. was replicated in All of Us. In Black individuals, this PRS (AUC and 95% CI: 0.576 [0.571, 0.582]) produced an increase in AUC similar to that from relative income (AUC: 0.573 [0.568, 0.577]) when added to an age-only model. Incorporating income with PRS, age, and genetic PCs 1-3 improved AUC by 0.007 in White Americans and 0.018 in Black Americans (both p < 10^-11), while attenuating the contribution of PRS in the full model. PRS performance also varied among SES categories. Notably, PRSs with variant effect sizes that were recalibrated in low-SES White participants performed best in low-SES White participants (AUC: 0.605 [0.583, 0.628]) and Black Americans (AUC: 0.588 [0.586, 0.591]), both better than performance in high-SES White Americans (AUC: 0.579 [0.577, 0.580]) and middle-SES White Americans (AUC: 0.578 [0.569, 0.586]). Conclusion: Socioeconomic context, measured by income, significantly impacts the transferability of a PRS for breast cancer within and among groups defined by self-identified race and ethnicity. Accounting for SES improves PRS performance, most notably in Black Americans and low-SES White individuals.
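
For readers less familiar with the metric used throughout this abstract, AUC is the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The function and the toy scores are hypothetical and are not drawn from the study or from All of Us.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise case/control comparison.

    labels: 1 for cases, 0 for controls. Quadratic in sample size, so this
    is suitable for illustration only, not biobank-scale data.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0   # case ranked above control
            elif case_score == control_score:
                wins += 0.5   # ties count as half
    return wins / (len(cases) * len(controls))


# Toy risk scores (hypothetical): two controls followed by two cases.
print(auc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1]))
```

On this scale 0.5 is chance-level discrimination, which puts the reported differences of 0.007 to 0.018 in context: they are small in absolute terms even when statistically significant.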

4

STELLAR: A flexible ensemble learning framework integrating rare variants to enhance polygenic risk prediction

Chen, T.; Li, X.; Mazumder, R.; Zhang, H.; Lin, X.

2026-06-09 genetic and genomic medicine 10.64898/2026.06.07.26355109 medRxiv

Top 0.5%

1.2%

Whole-exome and whole-genome sequencing technology has enabled the discovery of rare genetic variants associated with human health and diseases. However, existing statistical methods used for rare variant association testing are not well-suited for building genetic risk prediction models that jointly incorporate rare and common variants. We propose STELLAR, a flexible ensemble learning-based approach to compute rare variant polygenic risk scores (PRS) using association summary statistics to enhance conventional common variant PRS. Our method combines burden-based and penalty-based rare variant analysis and leverages functional annotation information to prioritize potentially causal variants within the prediction models. In simulation studies, PRS using STELLAR consistently showed the highest prediction accuracy compared to models using common variants alone or rare variant burdens. Applied to UK Biobank whole-exome sequencing data (n=310,831) across eight continuous and five binary traits, STELLAR significantly improved prediction accuracy, refined stratification of individuals at the highest genetic risk beyond common variants, and prioritized biologically relevant genes. STELLAR provides a scalable strategy to incorporate rare variants into PRS in addition to common variants, advancing precision risk prediction and enabling more comprehensive assessment of genetic contributions to complex diseases.

5

Investigating the Y chromosome in complex disease: Phenome-wide scan across 104,334 Finnish men

Preussner, A.; Leinonen, J. T.; FinnGen; Pirinen, M.; Tukiainen, T.

2026-06-10 genetic and genomic medicine 10.64898/2026.06.09.26355235 medRxiv

Top 0.5%

1.1%

Although the Y chromosome represents roughly 2% of the male genome, it is often ignored in genome-wide association studies (GWAS). Consequently, the potential health impacts of Y-chromosomal genetic variation remain incompletely understood. To fill this gap, we performed a phenome-wide association study (PheWAS) in FinnGen across 1,426 binary and quantitative traits using Y-chromosomal variation (frequency ≥ 1%) in 104,334 genotyped men. As Y chromosome variation is prone to population stratification, we performed carefully adjusted association analyses and further examined these through kin-based validation in 19,275 female and 24,712 male first-degree relatives. We found 121 suggestive (p < 5.6×10^-3) phenotypic associations in the Y chromosome, yet none of these were strong enough to reach phenome-wide significance (p < 3.9×10^-6). While only 38 associations were supported in the kin-based validation, intriguingly we found support for a previously suggested link between haplogroup I1 and coronary heart disease (CHD; OR=1.06, 95% CI=1.02-1.11, p=3.7×10^-3; male validation OR=1.05; female validation OR=0.97). The I1-CHD association was detected across distinct geographical areas within Finland and was independent of loss of Y (LOY) and of autosomal risk for CHD, suggesting a link between germline Y-chromosomal variation and heart disease risk. Overall, this study presents a comprehensive phenome-wide analysis of Y-chromosomal associations, highlighting the potential relevance of Y-chromosomal variation beyond sex determination. Our findings further emphasize the need for improved capture of Y-chromosomal variants and further analyses in biobank-scale data to allow for deeper exploration of the male-specific genetic architecture of complex diseases.
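
The abstract does not say how the suggestive and phenome-wide thresholds were derived. One reading, offered purely as an assumption, is a Bonferroni-style correction at a family-wise alpha of 0.05. The sketch below back-calculates how many tests each threshold would then imply. `ALPHA` and `implied_tests` are hypothetical names, and only the two thresholds and the trait count come from the abstract.

```python
ALPHA = 0.05      # assumed family-wise error rate; not stated in the abstract
N_TRAITS = 1426   # number of traits, from the abstract


def implied_tests(threshold: float, alpha: float = ALPHA) -> float:
    """Number of tests m for which alpha / m equals the given threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


suggestive_tests = implied_tests(5.6e-3)               # tests implied by the suggestive cutoff
phenome_tests_per_trait = implied_tests(3.9e-6) / N_TRAITS
print(f"{suggestive_tests:.1f} tests; {phenome_tests_per_trait:.1f} tests per trait")
```

Under this assumption both thresholds point to about nine tests per trait, which would fit a small set of Y-chromosomal variants or haplogroups tested against every phenotype. The full text, not this sketch, is the authority on what was actually done.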

6

Placental molecular subtypes of severe preeclampsia reveal divergent aging trajectories and fetal growth outcomes

Du, Y.; Benny, P. A.; Lahiri, S.; AlAkwaa, F. M.; Huang, Q.; Liu, Y.; Lassiter, C. B.; Astern, J.; Riel, J.; Garmire, L. X.

2026-06-04 sexual and reproductive health 10.64898/2026.06.02.26354756 medRxiv

Top 0.7%

0.8%

Severe preeclampsia (sPE) is a major cause of maternal and fetal morbidity worldwide, yet its placental molecular heterogeneity remains poorly captured by current clinical diagnosis. To resolve the molecular architecture of sPE, we integrated DNA methylation and proteomic profiling from a multi-ethnic cohort of 444 placentas from the Hawaiian Biorepository (HiBR), including 169 sPE cases, matched preterm controls, and full-term controls. To address cellular heterogeneity in bulk placental tissue, we developed HOMED (Hierarchically Optimized Methylation Deconvolution), a single-cell-guided hierarchical framework for inferring placental cell-type composition from DNA methylation data. HOMED-adjusted integrative analyses identified extensive subtype-specific alterations involving hypoxia, angiogenesis, immune activation, trophoblast differentiation, and metabolic remodeling. Molecular stratification revealed two reproducible sPE subtypes with divergent placental aging trajectories. One subtype exhibited a premature placental state marked by accelerated placental aging, whereas the other displayed less accelerated placental aging but a substantially increased risk of small-for-gestational-age birth (P = 0.028). These subtypes were independently replicated across six external cohorts and further supported by proteomic signatures achieving a classification accuracy of 0.88. Integrative epigenomic and proteomic analyses linked the growth-restricted subtype to hypoxia-associated glycolytic remodeling, suggesting distinct pathogenic mechanisms underlying clinically diagnosed sPE. Together, our findings redefine severe preeclampsia as a biologically heterogeneous placental disorder composed of molecularly distinct subtypes with divergent aging trajectories and fetal growth outcomes, providing a framework for mechanism-based stratification and precision obstetric medicine.

7

Age-Related Speech-in-Noise Hearing Loss in Parkinson's Disease and APOE E4 Carriers

Kmiecik, M. J.; Xu, W.; Weldon, C. H.; Guan, A.; McIntyre, M. H.; Bouchard, E. L.; 23andMe Research Team; Schneider, R. B.; Auton, A.; Aslibekyan, S.

2026-06-09 neurology 10.64898/2026.06.08.26355175 medRxiv

Top 1%

0.5%

Age-related hearing loss is a leading modifiable risk factor for dementia and is increasingly recognized as a non-motor feature of Parkinson's disease (PD). The apolipoprotein E (APOE) E4 allele is the strongest genetic risk factor for Alzheimer's disease and is associated with cognitive decline in PD, yet its relationship to hearing loss remains unclear. Therefore, we examined the independent and interactive effects of PD status and APOE E4 carrier status on age-related hearing loss using a validated web-based speech-in-noise (SIN) assessment in 239,620 23andMe Research Institute participants without PD and 4,361 PD cases. Generalized additive models for location, scale, and shape (GAMLSS) showed that both PD and APOE E4 independently exacerbated age-related hearing decline, with speech reception thresholds (SRTs) worsening non-linearly with advancing age, but without evidence of synergistic interaction. However, longitudinal analyses in a subcohort completing at least two assessments (1,434 PD cases; 36,242 controls) using GAMLSS mixed models showed a significant three-way interaction between PD status, APOE E4, and age^2, such that SIN hearing loss accelerated more steeply with age in APOE E4 carriers with PD. Males and individuals with lower educational attainment also exhibited worse SIN hearing loss. These results identify APOE E4 carriers with PD as a priority population for hearing screening and intervention, and support the integration of SIN assessments into routine PD care to detect hearing decline that may compound cognitive and communicative burden in aging.

8

Natural History of Prenatally Identified Children with 48,XXYY Syndrome in Infancy and Early Childhood

Nocon, K.; Swenson, K.; Bothwell, S.; Howell, S.; Davis, S.; Ikomi, C.; Ross, J.; Tartaglia, N.

2026-06-04 pediatrics 10.64898/2026.06.04.26353909 medRxiv

Top 1%

0.4%

Background: 48,XXYY syndrome is a rare sex chromosome aneuploidy (SCA) characterized by neurodevelopmental deficits and medical comorbidities. The sparse literature is almost exclusively limited to postnatally diagnosed cases. This study aims to describe the early medical and developmental features of prenatally identified 48,XXYY infants, with comparisons to 47,XYY and 47,XXY cohorts and typical populations, as well as previously reported postnatally diagnosed 48,XXYY cases. Methods: The eXtraordinarY Babies Study prospectively follows children prenatally identified to be at high risk for SCA with annual medical and neurodevelopmental evaluations. Data presented herein include the prevalence of medical conditions, developmental milestones, developmental and adaptive functioning assessment scores, and therapy utilization in participants confirmed to have 48,XXYY. Comparisons were made between this cohort and the typical population, infants with 47,XYY and 47,XXY also enrolled in the eXtraordinarY Babies Study, and a 2008 cohort of individuals postnatally identified with 48,XXYY. Results: Infants with 48,XXYY exhibited a range of early medical features, including high rates of feeding and GI disorders (breastfeeding difficulties, gastroesophageal reflux, and eosinophilic esophagitis), allergic disorders (food allergies and environmental allergies), and hypotonia. Developmental and adaptive functioning scores indicated delays in motor, communication, and social domains, with nearly all infants receiving speech therapy and physical and/or occupational therapy. Comparisons with the 47,XYY and 47,XXY cohorts revealed more medical and developmental challenges in the 48,XXYY group; however, there was variability and some overlap with both the general population and sex chromosome trisomy conditions.
Additionally, comparison to the 2008 postnatally identified 48,XXYY cohort indicated that while prenatal diagnosis allowed for earlier intervention, developmental outcomes in the first years of life were similar between the two groups. Conclusions: Prenatal diagnosis of 48,XXYY facilitates early monitoring, anticipatory guidance, and proactive referrals for medical evaluations and intervention, given that developmental delays and medical challenges are more common in infancy and early childhood than in the general population and trisomy SCAs. These findings provide valuable insights for genetic counselors and healthcare providers, emphasizing the spectrum of medical and developmental findings and the importance of early and proactive care to support individual outcomes. Prospective study of this prenatally identified cohort will provide important data on natural history and phenotypic variability in XXYY, as well as identification of predictors of health and developmental outcomes.

9

Documented clinical genetic testing among carriers of hereditary breast and ovarian cancer variants: Ancestry and socioeconomic disparities in the All of Us research program

Yerukala Sathipati, S.; Scott, H.

2026-06-10 oncology 10.64898/2026.06.09.26355262 medRxiv

Top 1%

0.4%

Importance: Hereditary breast and ovarian cancer (HBOC) variant carriers benefit from risk-reducing interventions, but only if identified. The extent to which carriers are clinically recognized, and whether recognition is equitable across diverse populations, is poorly characterized in a single large U.S. cohort. Objective: To estimate P/LP HBOC carrier prevalence across genetic ancestry groups, quantify documented clinical genetic testing among carriers, and evaluate ancestry and socioeconomic disparities in testing. Design, Setting, and Participants: Cross-sectional analysis of the All of Us Research Program Controlled Tier (Curated Data Repository v8/C2024Q3R9), comprising participants with short-read whole genome sequencing and linked electronic health record (EHR) and survey data. Carriers were ascertained from research genomic data independent of clinical testing. Exposures: Genetically inferred ancestry (African [AFR], Admixed American [AMR], East Asian [EAS], European [EUR], Middle Eastern [MID], South Asian [SAS]); self-reported household income and educational attainment. Main Outcomes and Measures: (1) Carrier prevalence with Wilson 95% CIs; (2) documented clinical genetic testing (procedure codes) among carriers; (3) adjusted odds of documented testing among women, by ancestry, before and after socioeconomic adjustment, using multivariable logistic regression. Results: Among 414,830 participants, P/LP HBOC carrier prevalence was 1.42% (95% CI, 1.38-1.45) overall and similar across ancestry groups (AFR 1.24%, AMR 1.32%, EAS 1.19%, EUR 1.52%, MID 1.68%, SAS 1.33%; overlapping CIs). Among 250,071 women in the testing analysis, documented clinical genetic testing was rare: only 74 of 5,878 carriers overall (1.3%) and 59 of 3,572 European-ancestry carriers (1.7%) had a documented test, with counts below reportable thresholds in all other ancestry groups. 
African-ancestry women had lower adjusted odds of documented testing than European-ancestry women (Model 1 adjusted odds ratio [aOR], 0.32; 95% CI, 0.27-0.39), an association that attenuated but persisted after adjustment for income and education (Model 2 aOR, 0.48; 95% CI, 0.40-0.58; P < 0.001); Admixed American women also had reduced adjusted odds (aOR, 0.71; 95% CI, 0.61-0.84). Lower income and lower education were independently and dose-dependently associated with lower testing odds (income <$25,000 aOR, 0.46; high-school education aOR, 0.54). Conclusions and Relevance: High-risk HBOC variant carriers are present across all ancestry groups at similar frequencies, yet rates of documented clinical genetic testing differed across ancestry groups. African-ancestry women experience a testing gap that is not fully explained by socioeconomic position, implicating structural barriers in access and referral. Population-level strategies that decouple carrier identification from current referral pathways may be required to close this gap.

10

Whole-exome-based preconception carrier screening in Uzbekistan with targeted SMA, FMR1, and DMD assays: the first reported clinical program

Kullyev, A.; Avdeichik, S.; Akimenkova, A.; Kartuesov, A.; Kardymon, O.; Goikhman, Y.

2026-06-04 genetic and genomic medicine 10.64898/2026.06.02.26354713 medRxiv

Top 2%

0.4%

Purpose: Published clinical outcome data on preconception carrier screening (PCS) in Central Asia are limited. We report the first clinical implementation study from Uzbekistan of a whole-exome sequencing (WES)-based multi-platform PCS program combining exome sequencing with targeted SMA, FMR1, and DMD assays. Methods: We retrospectively analyzed anonymized data from 65 individuals (19 couples, 27 singletons) screened at IMC Genomics, Tashkent, between January 2024 and May 2026. WES covering the protein-coding regions of approximately 20,000 genes was followed by exome-wide bioinformatics filtering and clinical geneticist interpretation. Partly overlapping cohorts underwent SMA carrier screening (n=179), FMR1 CGG-repeat analysis in females (n=155), and DMD deletion/duplication testing in preconception females (n=29). Variants were classified by ACMG/AMP criteria against gnomAD v4.1. Results: Sixty-one of 65 WES-screened individuals (93.8%; 95% CI 85.2-97.6%) carried at least one reportable variant (152 instances across 126 genes). Four of 19 couples (21.1%; 95% CI 8.5-43.3%) were concordant for pathogenic or likely pathogenic variants in the same autosomal recessive gene; two were referred for preimplantation genetic testing for monogenic disease. SMA screening identified four carriers, including two 2+0 silent carriers; FMR1 analysis identified one intermediate allele; DMD MLPA identified no exonic rearrangements. Conclusion: This first reported WES-based multi-platform PCS program in Uzbekistan was feasible and clinically informative, identifying actionable couple-level reproductive risks and supporting structured implementation of reproductive genetic screening in Central Asia.
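
The two confidence intervals in the Results can be checked by hand. The abstract does not name the interval method, but both reported ranges agree to one decimal place with the Wilson score interval at z = 1.96, so the sketch below assumes that method. `wilson_interval` is a hypothetical helper, not code from the study.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.96 gives an approximate 95% interval. Assumed method: the
    abstract reports intervals without naming how they were computed.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Counts taken from the abstract.
for label, k, n in [("individuals with a reportable variant", 61, 65),
                    ("concordant couples", 4, 19)]:
    lo, hi = wilson_interval(k, n)
    print(f"{label}: {k}/{n} = {k / n:.1%}, 95% CI {lo:.1%} to {hi:.1%}")
```

The width of the couple-level interval follows directly from n = 19, which is worth keeping in mind when reading the 21.1% concordance figure.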

11

Reproductive health in Mexican women with systemic lupus erythematosus: pregnancy outcomes, menstrual irregularities and early menopause

Sevilla-Parra, G.; Bravo-Garcia, F.; Mier y Teran Guevara, M.; Montes-Garcia, A.; Schäfer, A.; Ochoa-Rodriguez, N.; Bienvenu Caballero, M.; Gonzalez Zenteno, S. G.; Pena-Ayala, A.; Tinajero-Nieto, L.; Torres-Valdez, E.; Martinez, D.; Hernandez-Ledesma, A. L.; Medina-Rivera, A.; Alpizar-Rodriguez, D.

2026-06-09 sexual and reproductive health 10.64898/2026.06.07.26354004 medRxiv

Top 2%

0.3%

Objective: To characterize pregnancy outcomes and menstrual irregularities in Mexican women with systemic lupus erythematosus (SLE) and identify clinical factors associated with adverse pregnancy outcomes and early-onset menopause. Methods: We conducted a cross-sectional study of women with SLE enrolled in the Mexican Lupus Registry (LupusRGMX) between May 2021 and September 2024. Clinical and reproductive data were collected using standardized questionnaires. Menopause was defined as the absence of menstruation for ≥12 consecutive months, and early menopause as onset before age 40. Univariable and multivariable logistic regression analyses were used to identify factors associated with pregnancy complications and early menopause. Results: A total of 210 women were included. Median age was 38 years (IQR 29-46) and median disease duration was 4 years (IQR 1-10). Among women with a history of pregnancy (47%), full-term delivery predominated (61%), while pregnancy loss occurred in 26% and preterm delivery in 13%. Pregnancy complications were reported in 9.6%, most commonly preeclampsia (6.7%). Younger maternal age was independently associated with pregnancy complications (OR 0.89, 95% CI 0.83-0.95) and adverse outcomes (OR 0.95, 95% CI 0.92-0.98). Higher disease activity was associated with complications in univariable analysis. Most pregnancies (68.3%) occurred before diagnosis. Early menopause was observed in 6.2% and independently associated with longer disease duration and older age. Conclusion: Younger maternal age was independently associated with adverse pregnancy outcomes, whereas disease activity showed an association in univariable analysis. Most pregnancies occurred prior to SLE diagnosis. Early menopause was associated with longer disease duration, suggesting impact of cumulative disease burden on ovarian function.

12

Heterozygous MMACHC burden variants are associated with higher circulating vitamin B12 in the All of Us Research Program

Cai, L.; DeBerardinis, R. J.

2026-06-04 genetic and genomic medicine 10.64898/2026.06.03.26354855 medRxiv

Top 2%

0.3%

Heterozygous carriers of autosomal recessive disease variants are conventionally considered unaffected, yet population-scale genomic datasets reveal subclinical carrier phenotypes. MMACHC encodes a cobalamin-processing protein whose biallelic loss causes cobalamin C deficiency, an inborn error of intracellular cobalamin metabolism. We performed an unbiased quantitative phenome-wide association screen in All of Us Research Program v8 to identify phenotypes associated with rare heterozygous MMACHC burden variants. Serum/plasma vitamin B12 was the top quantitative association. Carriers had higher circulating B12 than non-carriers in adjusted analyses, but also higher homocysteine, suggesting that elevated circulating B12 does not reflect improved intracellular cobalamin function. Carriers were less likely to fall below conventional B12 insufficiency thresholds, indicating a potential diagnostic blind spot. A pathway-wide rare-variant gene-burden (All-by-All) analysis placed this finding in broader biological context. Burdens in genes related to circulating B12 binding or intestinal absorption were associated with lower circulating B12. In contrast, burdens in several genes involved in cellular delivery and intracellular cobalamin handling were associated with higher circulating B12. This step-specific directionality supports a model in which elevated circulating B12 can reflect impaired cellular handling and consequent systemic accumulation rather than improved cellular cobalamin availability. Because EHR-derived B12 is shaped by heterogeneous clinical and medication contexts, prospective carrier-enriched studies with standardized methylmalonic acid, homocysteine, diet, supplement, medication, comorbidity, and symptom ascertainment are needed to evaluate functional-marker-based screening.

13

Rare neurological and neurodevelopmental variants in ALS link to onset, survival and family history

O'Donoghue, C.; Kacar, E.; Gomes, T.; Costello, E.; Pender, N.; Peelo, C.; Ryan, M.; Heverin, M.; Byrne, S.; Bede, P.; Hardiman, O.; McLaughlin, R. L.; Byrne, R. P.

2026-06-10 genetic and genomic medicine 10.64898/2026.06.09.26354977 medRxiv

Top 2%

0.3%

Background: Neurological, neuropsychiatric, and neurodevelopmental disorders cluster in ALS families, sharing a common genetic architecture with ALS. Pathogenic variants in genes associated with other neurological, neurodevelopmental, or neuropsychiatric disorders may also co-occur in ALS and modify phenotype. We have sought to determine the prevalence and clinical pattern of likely-pathogenic/pathogenic (LP/P) non-ALS neurological, neurodevelopmental, and neuropsychiatric variants, alone and in combination with ALS-gene variants, in two large ALS cohorts. Methods: Whole-genome sequencing (WGS) of 469 Irish and 774 Answer ALS people with ALS (pwALS) was analysed for ClinVar LP/P variants associated with other neurological (n = 15,541), neurodevelopmental (n = 9,761), and neuropsychiatric (n = 321) phenotypes. Inheritance patterns for associated genes (autosomal recessive/autosomal dominant) along with the associated phenotype were validated using OMIM. Standardised clinical data included family history, site and age of onset, El Escorial category, survival, motor decline, and cognitive and behavioural assessments. Known ALS-gene variants and C9orf72 repeat expansion status were included for each cohort. Results: Non-ALS neurological variants were identified in 47/469 (10.0%) Irish and 69/774 (8.9%) Answer ALS participants, most frequently in hereditary spastic paraplegia-associated genes (3.2% Irish; 2.8% Answer ALS). Irish neurological variant carriers showed higher frequency of respiratory onset (10.6% vs 1.2%, Fisher's exact p = 0.002, Φ = 0.20) and fewer premorbid behavioural symptoms (0.92 ± 0.56 vs 3.08 ± 0.97, Cohen's d = -0.40). Neurodevelopmental variants occurred in 12/469 (2.6%) Irish and 20/774 (2.6%) Answer ALS participants.
In the Irish cohort, neurodevelopmental variant carriers had significantly shorter survival in a Cox proportional hazards model (log-rank p = 0.005), corresponding to a more than two-fold increased hazard of death (HR = 2.25, 95% CI 1.26-4.00), and had significantly increased familial burden of neuropsychiatric disorders among first- and second-degree relatives (negative binomial IRR for carriers = 2.41, 95% CI: 1.12-5.18, p = 0.025). Across combined cohorts, 18 individuals (Irish n = 8; Answer ALS n = 10) carried ≥2 LP/P variants spanning ALS and non-ALS genes. Conclusion: Rare LP/P variants in genes associated with other neurological and neurodevelopmental disorders occur in up to 12% of pwALS across two independent cohorts. Carriers show distinct phenotypes, shorter survival, and characteristic family history patterns. These findings suggest that extended pleiotropic and oligogenic architectures may contribute to ALS heterogeneity.
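
A ratio estimate and its 95% confidence interval are enough to recover an approximate standard error on the log scale, and from that a Wald z statistic and two-sided p-value. The sketch below applies this standard back-calculation to the HR and IRR quoted above. It assumes the intervals are symmetric on the log scale and were built with z = 1.96, and `wald_from_ci` is a hypothetical helper rather than the authors' analysis.

```python
import math


def wald_from_ci(estimate: float, lower: float, upper: float,
                 z_crit: float = 1.96) -> tuple[float, float, float]:
    """Recover (log-scale SE, Wald z, two-sided p) from a ratio and its CI.

    Assumes a CI of the form exp(log(estimate) +/- z_crit * SE).
    """
    if not 0 < lower < upper:
        raise ValueError("need 0 < lower < upper")
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


# Values quoted in the abstract.
for label, est, lo, hi in [("HR, survival", 2.25, 1.26, 4.00),
                           ("IRR, familial burden", 2.41, 1.12, 5.18)]:
    se, z, p = wald_from_ci(est, lo, hi)
    print(f"{label}: SE(log) = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

The recovered IRR p-value is close to the reported 0.025. The survival p-value is reported from a log-rank test, a different statistic, so only rough agreement with the Wald value should be expected there.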

14

Shared epigenetic regulation acting on neuroimmune pathways contributes to the comorbidity between generalized anxiety disorder and COVID-19

Karaca, S.; Cabrera Mendoza, B.; He, J.; Qiu, D.; Davtian, D.; Lacobelle, A.; Nunez, Y. Z.; Krystal, J. H.; Pietrzak, R. H.; Gelernter, J.; Polimanti, R.

2026-06-04 genetic and genomic medicine 10.64898/2026.06.03.26354830 medRxiv

Top 2%

0.3%

Background: The biological mechanisms linking generalized anxiety disorder (GAD) and COVID-19 remain poorly understood, despite substantial evidence of their comorbidity. To address this gap, we examined genetic and epigenetic factors underlying their co-occurrence. Methods: In a multi-ancestry sample of 893 participants, we conducted genome-wide and epigenome-wide analyses of GAD and COVID-19 severity. Integrating large-scale genome-wide datasets and information regarding methylation quantitative trait loci, complementary analytic approaches were used to identify regional methylation patterns, assess genetically regulated DNA methylation in blood and brain tissue, and evaluate causal loci shared between GAD and COVID-19. Results: GAD was associated with epigenome-wide significant variation in loci involved in chromatin regulation and synaptic signaling. Conversely, COVID-19-related epigenetic signals were enriched in immune-inflammatory and host-response pathways. Mild COVID-19 was epigenetically related to endothelial-inflammatory signals, while severe COVID-19 was linked to epigenetic changes implicated in myeloid and thrombo-inflammatory pathways. Epigenetic signals shared between GAD and COVID-19 implicated processes related to stress adaptation and tissue homeostasis. Genetically informed analyses identified 60 shared loci, including MAPT, ZFP57, and FBXL18, indicating pleiotropy between GAD and COVID-19 in genetically regulated DNA methylation variation. Brain-specific analyses further highlighted convergence in additional loci (i.e., MICB and HLA-DPB1), suggesting neuroimmune mechanisms underlying GAD-COVID-19 shared methylation patterns. Conclusions: These findings support that GAD and COVID-19 share epigenetic and genetic architecture involving pathways related to vascular integrity, immune function, and cellular adaptation, highlighting a potential neuroimmune basis for their co-occurrence.

15

Multi-ancestry genome-wide association study and meta-analysis of stimulant use disorder reveals biology and relationships to other psychiatric disorders

Beck, S. E.; Deak, J. D.; Levey, D. F.; Ge, T.; Jeffries, P. W.; Lai, D.; Mallard, T. T.; Degenhardt, L.; Lind, P. A.; Tollerup Nielsen, T.; Tubbs, J. D.; Wetherill, L.; Johnson, E. C.; Hatoum, A. S.; The SUD Working Group of the Psychiatric Genomics Consortium; COGA Collaborators; Yale-Penn Collaboration; The VA Million Veteran Program; Borglum, A.; Demontis, D.; Medland, S. E.; Martin, N. G.; Nelson, E. C.; Smoller, J. W.; Kranzler, H. R.; Gaziano, J. M.; Stein, M. B.; Agrawal, A.; Edenberg, H. J.; Gelernter, J.

2026-06-10 genetic and genomic medicine 10.64898/2026.06.05.26354997 medRxiv

Top 2%

0.3%

Stimulant use disorder (StimUD) is a significant public health problem, but genetic studies have been limited by small sample sizes. We conducted genome-wide association studies (GWAS) of StimUD in the Million Veteran Program (MVP) and All of Us (AOU), followed by meta-analysis with FinnGen and 10 additional datasets, for a total of 709,369 individuals (Ncases=33,977, Ncontrols=675,392) in four broad ancestry groups: European (EUR) (Ncases=22,564, Ncontrols=624,672), African (AFR) (Ncases=7,574, Ncontrols=34,189), Admixed American (AMR) (Ncases=3,657, Ncontrols=15,698), and East Asian (EAS) (Ncases=182, Ncontrols=833). Population-specific SNP heritability was 6.1% in EUR and 2.4% in AFR. We discovered a total of 19 genome-wide-significant loci: six in EUR (including DRD2*rs5794864, P=7.32E-10), one in AFR, five in a multi-ancestry meta-analysis (including CHRNA5*rs55781567, P=3.27E-9), two in a male-only meta-analysis (including FTO*rs8057044, P=9.50E-9), and five in a meta-analysis of sex-stratified results. In a hold-out AOU subsample (NEUR=18,841, NAFR=12,263, NAMR=9,739), ancestry-specific polygenic risk scores were significantly associated with StimUD in EUR (OR=3.28, 95% confidence interval (CI)=2.89-3.71) and AMR (OR=2.01, 95% CI=1.71-2.37). Transcriptome-wide association studies, fine-mapping, and colocalization analyses prioritized additional genes (e.g., GPX1, BSN). Genetic correlation, Mendelian randomization, and causal mixture analyses revealed relationships with other substance use and use disorder phenotypes, including cannabis use disorder (rg=0.94, P=5.43E-237) and opioid use disorder (rg=1.01, P=4.40E-107), and other psychiatric traits, including anxiety, depression, neuroticism, and attention-deficit/hyperactivity disorder. This is the first well-powered GWAS of StimUD, and it offers significant insights into disease biology.
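
The per-ancestry sample sizes and the locus counts in this abstract can be checked against the stated totals. The following Python sketch sums the reported figures; the dictionary names are illustrative and not taken from the preprint.

```python
# Case and control counts per ancestry group, as reported in the abstract.
COUNTS = {
    "EUR": (22_564, 624_672),
    "AFR": (7_574, 34_189),
    "AMR": (3_657, 15_698),
    "EAS": (182, 833),
}

# Genome-wide-significant loci per analysis, as reported in the abstract.
LOCI = {
    "EUR": 6,
    "AFR": 1,
    "multi-ancestry meta-analysis": 5,
    "male-only meta-analysis": 2,
    "meta-analysis of sex-stratified results": 5,
}

total_cases = sum(cases for cases, _ in COUNTS.values())
total_controls = sum(controls for _, controls in COUNTS.values())
total_n = total_cases + total_controls
total_loci = sum(LOCI.values())

print(f"cases={total_cases:,} controls={total_controls:,} total={total_n:,}")
print(f"loci={total_loci}")

# Share of all cases contributed by each ancestry group.
for ancestry, (cases, _) in COUNTS.items():
    print(f"{ancestry}: {100 * cases / total_cases:.1f}% of cases")
```

The sums reproduce the totals given in the abstract: 33,977 cases, 675,392 controls, 709,369 individuals, and 19 loci.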

16

Incremental Clinical Value of Single-Molecule Nanopore Sequencing in Thalassemia Testing: A Prospective Double-blind, Multicenter Study

Xiang, J.; Zhu, B.; Xu, H.; Chen, Y.; Sun, X.; Xiang, R.; Zhao, Y.; Liu, W.; Zhang, L.; He, J.; Liu, J.; Chen, Y.; Fan, Z.; Zhang, H.; Tan, J.; Pang, L.; Shi, L.; Kong, Y.; Cai, A.

2026-06-09 hematology 10.64898/2026.06.09.26354559 medRxiv

Top 2%

0.2%

Background: Thalassemia is one of the most common monogenic disorders worldwide, yet current screening strategies that combine hematological testing with molecular assays still carry a risk of missed diagnoses and limited efficiency, particularly for complex structural variants and rare mutations. Methods: In this prospective double-blind, multicenter cohort study of 3,842 participants (3,362 pregnant women and 480 male partners), we conducted a head-to-head comparison to systematically evaluate the incremental clinical value and detection performance of single-molecule nanopore sequencing in thalassemia (SMITH) against conventional hematological testing and next-generation sequencing (NGS). Findings: The overall concordance rate between NGS and SMITH was 98.6% (3789/3842). The discrepant cases (n=53) were directly attributed to the superior detection capabilities of SMITH, which successfully identified complex structural rearrangements (including 45 α-globin gene triplications and four HK alleles) that were missed by NGS. Furthermore, SMITH accurately detected four rare variants (c.134_135insT/, c.-22(C>T)/, βN/βc.316-290delinsAGGGCAATAATTT and β3.5 kb deletion/βN) and resolved ten trans and three cis configurations within the globin gene allele. Clinically, these technical advantages translated to a 9.3% (5/54) increase in the detection rate of high-risk prenatal couples, effectively preventing one birth affected by moderate-to-severe thalassemia. Additionally, SMITH corrected a diagnostic discrepancy in one case (HK vs. -3.7), sparing the couple from an unnecessary invasive procedure. Interpretation: Our findings demonstrate that SMITH provides a powerful platform for resolving globin gene rearrangements, detecting rare variants, and enabling direct haplotype phasing. By effectively eliminating diagnostic blind spots, SMITH is expected to become an optimal method for thalassemia prevention programs.
Funding: This study was supported by Chinese National Natural Science Foundation Projects 81760037 and 82271894.
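
The headline figures in this abstract are simple ratios of the reported counts. The Python sketch below recomputes them from the numbers stated above; the variable names are illustrative. The abstract itself does not say that the three finding types sum to the 53 discrepant cases, so that line is only an observation.

```python
# Participant counts as reported in the abstract.
participants = {"pregnant women": 3362, "male partners": 480}
total = sum(participants.values())

# Cases where NGS and SMITH agreed, from the reported ratio 3789/3842.
concordant = 3789
discrepant = total - concordant
concordance_pct = round(100 * concordant / total, 1)

# Findings the abstract lists as detected by SMITH but missed by NGS.
smith_only_findings = {
    "globin gene triplications": 45,
    "HK alleles": 4,
    "rare variants": 4,
}
findings_total = sum(smith_only_findings.values())

# Reported increase in detection of high-risk prenatal couples (5/54).
high_risk_gain_pct = round(100 * 5 / 54, 1)

print(f"participants: {total}")
print(f"concordance: {concordance_pct}% ({concordant}/{total}), discrepant: {discrepant}")
print(f"listed SMITH-only findings: {findings_total}")
print(f"high-risk couple detection gain: {high_risk_gain_pct}%")
```

The recomputed values agree with the abstract: 3,842 participants, 98.6% concordance, 53 discrepant cases, and a 9.3% gain.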

17

Human genetic evidence links serine biosynthesis to diabetic peripheral neuropathy

Fridman, V.; Kakar, A.; Jensen, A.; Van de Vondel, L.; Wheeler, A.; Phillips, L. S.; Zhou, J.; Zuchner, S.; Reusch, J.; Raghavan, S.

2026-06-10 genetic and genomic medicine 10.64898/2026.06.09.26355286 medRxiv

Top 2%

0.2%

Diabetic peripheral neuropathy (DPN) is a common and disabling condition for which no disease-modifying therapies are available. Glycemic and metabolic drivers do not fully explain why only a subset of individuals with diabetes develop DPN, and genetic contributors remain poorly defined. We aimed to perform a multi-population genome-wide association study (GWAS) of DPN to highlight potential new etiological pathways and therapeutic targets. Methods: We performed a multi-population GWAS of neuropathy in people with and without diabetes using the VA Million Veteran Program and UK Biobank, followed by replication in the All of Us Research Program (AoU), and gene-based and gene-set analyses to identify implicated pathways. Causal relationships between circulating serine levels and DPN were further tested using two-sample Mendelian randomization. To further evaluate pathogenic potential, we analyzed rare, high-impact variants in GWAS-implicated genes among individuals with unresolved inherited neuropathies using the GENESIS platform. Findings: Among individuals with type 2 diabetes, we identified seven genome-wide significant loci (p<5x10-): PHGDH and PSPH (key serine synthesis genes), TEAD1, CYP4F11, LARGE1, FTO, and COBLL1. No loci were significant in individuals without diabetes or with type 1 diabetes. Four loci (PHGDH, TEAD1, FTO and CYP4F11) replicated in AoU (p<0.05). Mendelian randomization demonstrated that higher genetically predicted serine levels were associated with lower DPN risk, consistent with a causal role of serine metabolism in disease pathogenesis. Rare-variant burden analyses revealed associations of predicted deleterious variants with inherited neuropathy case status in PHGDH (odds ratio [OR] 12.7 [95% CI 7.9, 20.4]), PSPH (OR 8.5 [7.2, 10.2]), PHKG1 (OR 4.8 [3.7, 6.3]), and LARGE1 (OR 0.007 [0.0004, 0.1]). Interpretation: Convergent genetic evidence across common and rare variation implicates serine synthesis as a key pathway in DPN. These findings link diabetic and inherited neuropathies through a shared metabolic mechanism, identifying serine metabolism as a potential therapeutic target.
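
One way to sanity-check the reported odds ratios is to compare each with the midpoint of its confidence interval on the log scale. The Python sketch below does this under an assumption that the abstract does not state: that the intervals are symmetric on the log-odds scale, as Wald-type intervals are. The function and variable names are illustrative.

```python
import math

# Odds ratios and 95% CI bounds as reported in the abstract: (OR, lower, upper).
REPORTED = {
    "PHGDH": (12.7, 7.9, 20.4),
    "PSPH": (8.5, 7.2, 10.2),
    "PHKG1": (4.8, 3.7, 6.3),
    "LARGE1": (0.007, 0.0004, 0.1),
}


def log_scale_midpoint(lower: float, upper: float) -> float:
    """Geometric mean of the bounds, i.e. the interval midpoint on the log scale."""
    return math.sqrt(lower * upper)


# Assumption (not from the abstract): intervals are symmetric in log-odds.
midpoints = {
    gene: log_scale_midpoint(lower, upper)
    for gene, (_, lower, upper) in REPORTED.items()
}

for gene, (odds_ratio, lower, upper) in REPORTED.items():
    print(f"{gene}: reported OR {odds_ratio}, log-scale CI midpoint {midpoints[gene]:.3g}")
```

Under that assumption the midpoints fall close to the reported odds ratios, with the largest relative gap for LARGE1, where the published values are rounded to a single significant figure.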

18

Polygenic risk of cardiovascular disease manifests in cardiac structure and function

Felici, B.; Ritchie, S. C.; Khullar, S.; Foguet, C.; Persyn, E.; Manikpurage, H. D.; Liu, Y.; Lambert, S. A.; Ip, S.; Rudd, J. H. F.; Inouye, M.

2026-06-08 cardiovascular medicine 10.64898/2026.06.07.26354998 medRxiv

Top 3%

0.2%

Cardiovascular diseases (CVDs) are highly heritable, but pathogenesis at the organ and physiological level is still poorly defined. Polygenic risk scores (PRSs), which estimate individual genetic susceptibility to a disease, may allow for the identification of associated abnormal organ structures. Ultimately, identifying where cardiovascular polygenic risk manifests can guide early interventions, shape mechanistic hypotheses, and motivate prevention trials for cardiac remodelling. This study investigated the association between PRSs for five common CVDs [heart failure (HF), coronary artery disease (CAD), atrial fibrillation (AF), abdominal aortic aneurysm (AAA) and ischaemic stroke (IS)] and 28 imaging-derived phenotypes (IDPs) from cardiac magnetic resonance imaging of ~62,000 participants in UK Biobank. To investigate the cardiac features associated with elevated polygenic risk of CVDs, we tested CVD PRSs against cardiac IDPs and identified 97 significant associations (FDR ≤ 0.05). We further identified 32 significant putative mediators between CVD PRSs and incident disease events, revealing that across CVDs, polygenic risk manifested as distinct patterns in cardiac structures. HF implicated all cardiac chambers, including left ventricular and left atrial dysfunction alongside enlarged aorta. AF was characterised by biatrial enlargement and reduced ejection fractions, most prominently in the left atrium but also involving left ventricular wall thickness. IS exhibited left ventricular hypertrophy and left atrial dysfunction, while CAD predominantly involved left ventricular hypertrophy. AAA was primarily characterised by enlarged descending aorta. Overall, cardiac IDPs mediated a substantial proportion of polygenic risk for CVDs, in particular for HF. Taken together, our results show that cardiac structure and function lie on the pathway between polygenic risk and cardiovascular events.

19

A mechanistic model for genetic regulation of postmenopausal bone loss

Rattsev, I.; Mac Gabhann, F.; Hertz, D.; Taylor, C. O.

2026-06-08 endocrinology 10.64898/2026.06.04.26354968 medRxiv

Top 3%

0.2%

Bone remodeling is a tightly regulated physiological process that maintains bone health through the coordinated action of bone-resorbing osteoclasts and bone-forming osteoblasts. Disruption of this balance, such as that induced by estrogen decline after menopause, results in bone loss and osteoporosis. Genetic factors play an important role in determining bone mineral density (BMD) loss over time. However, translating genetic associations into individualized risk prediction remains challenging due to the small effect sizes of individual variants and non-linear interactions within the bone remodeling unit. Here, we present a bone cell population dynamics model that includes major regulatory pathways, such as the RANK/RANKL/OPG axis, Wnt signaling, and hormonal regulation by estrogen, parathyroid hormone, and TGF-β. We calibrate the model on clinical data from healthy postmenopausal women and from women with reduced BMD undergoing anti-osteoporotic therapy. The calibrated model captures healthy BMD decline in postmenopausal women and therapeutic response to anti-osteoporotic medications. We mechanistically incorporate the effect of 22 variants across 8 genes involved in bone remodeling and simulate BMD trajectories in 1,000 virtual subjects differing by ancestry and genetic makeup. The median predicted 5-year BMD loss was 3.57% (95% prediction interval: 1.31-5.24), consistent with the values reported in the literature. The virtual individuals with African ancestry were predicted to experience the highest average 5-year BMD loss. The strongest genetic risk factors for bone loss were predicted to be CYP19A1 rs727479 and OPG rs3102735, while LRP5 rs11228240 emerged as a protective factor that could partially counteract the detrimental effects of other variants. Several epistatic effects were observed in the genetic interaction analysis. Mechanistically, our model suggested that estrogen exerts its effect on bone remodeling primarily by modulating osteoclast apoptosis. Overall, this framework provides a proof of concept for integrating genetic risk factors into mechanistic models of disease and can be extended to other conditions with polygenic inheritance.

20

Parental educational attainment polygenic scores contribute to phenotypic heterogeneity in offspring with autism

Gao, S.; Sui, Y.; Tian, P.; Rao, X.; Yan, C.; Xu, Y.; Wang, T.

2026-06-08 genetic and genomic medicine 10.64898/2026.06.03.26354779 medRxiv

Top 3%

0.2%

Educational attainment-related polygenic scores have been implicated in autism spectrum disorder (ASD), but how parental polygenic scores shape offspring phenotypes remains unclear. Using genotyping and exome-sequencing data from 142,357 individuals (55,252 ASD cases) in a large ASD cohort, we dissected the direct and indirect genetic effects of educational attainment-related polygenic scores on ASD phenotypes. Trio-model analyses showed that parental polygenic scores for educational attainment (PGSEA) were associated with milder core ASD symptoms, including social deficits and repetitive behaviors, predominantly through indirect genetic effects, whereas their associations with comorbidities were driven predominantly by direct genetic effects. PGSEA was also significantly negatively associated with rare variant burden and prenatal factors, although these factors contributed largely independently to most phenotypes. Adjustment for full-scale intelligence quotient (FSIQ) and socioeconomic status (SES) partially attenuated the indirect effects of PGSEA on offspring phenotypes. Finally, higher parental PGSEA was associated with later age at diagnosis in offspring, partly through its protective effects on ASD phenotypes. These findings indicate that indirect genetic effects of parental PGSEA contribute substantially to phenotypic variation in ASD and highlight family-mediated pathways as an important component of ASD heterogeneity.